2/9/23
Q: When is the presentation?
A: Discussing this today! Written reports will be throughout the rest of the quarter. Oral presentation will be part of your final project. These will be able to be recorded or given live in person during finals week.
Q: Will we have another lab with as many questions as Lab 4? The turnaround was pretty stressful, so just want to be prepared.
A: The next lab (multiple linear regression) is also a tady lengthy, but after that I don’t plan on the rest being quite as long. Just as a reminder that you do not need to complete the entire lab to receive credit!
Generate a visualization that will allow readers to determine whether male or female penguins are larger (by mass).
Generate a barplot that visualizes how many penguins there are from each species on each island. Each island should be a different panel (in a 1 row x 3 columns visualization), and each chart should visualize the species count.
Generate a scatterplot that will allow the viewer to determine whether flipper length has differed over time. Be sure to color the points on this plot by species.
::: panel-tabset
Imitation is the highest form of flattery
# Eric890916
chessData <- data.frame(country = c("United States", "Germany", "Canada", "Spain", "Russia", "France", "Bosnia and Herzegovina", "Croatia", "Turkey", "Austria"),
num = c(89, 55, 44, 41, 36, 34, 32, 32, 31, 29))
ggplot(chessData, aes(y = reorder(country, num), x = num)) +
geom_col(fill = "#008080") +
geom_text(aes(label = num), hjust = 1, nudge_x = -.5) +
labs(title = "More players transfer to the U.S. than to any other country",
subtitle = "Nations that received the highest number of player transfers, 2000-17",
caption = "2017 data as of April 11. SOURCE: FIDE",
x = "NUMBER OF TRANSFERS", y = "COUNTRY")# JulianBouchard
# recreation of the data
parents <- tibble(
category = c(rep("", 100), rep("BY RACE", 400), rep("BY INCOME", 300)),
name = c(
rep("All parents", 100),
rep("Black", 100), rep("Hispanic", 100), rep("White", 100), rep("Asian", 100),
rep("Lower income", 100), rep("Middle income", 100), rep("Upper income", 100)
),
val = c(
rep("all", 25), rep("some", 58), rep("other", 17),
rep("all", 39), rep("some", 44), rep("other", 17),
rep("all", 39), rep("some", 49), rep("other", 12),
rep("all", 18), rep("some", 63), rep("other", 19),
rep("all", 13), rep("some", 57), rep("other", 30),
rep("all", 38), rep("some", 49), rep("other", 13),
rep("all", 21), rep("some", 60), rep("other", 19),
rep("all", 14), rep("some", 65), rep("other", 21)
)
)
parents |>
# adjust the order of all names
mutate(
name = fct_rev(fct_relevel(
name,
"All parents",
"Black",
"Hispanic",
"White",
"Asian",
"Lower income",
"Middle income",
"Upper income"
))) |>
# adjust the order of the categories
mutate(
category = fct_relevel(
category,
"",
"BY RACE",
"BY INCOME"
)) |>
# adjust the order in which the 'all', 'some' or 'other' is shown
mutate(
val = fct_relevel(
val,
"all",
"some",
"other"
)) |>
ggplot(aes(y = name, fill=val)) +
geom_bar(position = "fill") +
scale_x_reverse() +
labs(title = "Black and Hispanic parents are more likely to enjoy parenting",
subtitle = str_wrap("Share of American parents who said they find being a parent enjoyable, by income group, race and overall", 80),
caption = str_wrap("White, Black and Asian parents include those who report being only one race and are not Hispanic. Hispanic parents are of any race. Middle income is defined as two-thirds to double the median annual family income for panelists in Pew's American Trends Panel. Lower income falls below that range and upper income falls above it.", 110),
fill="",
color = "Species") +
facet_wrap(~category, ncol = 1) +
scale_fill_manual(
values=c("#066971", "#6dcbcd", "#e0e0e0"),
labels=c("ENJOYABLE ALL OF THE TIME", "ENJOYABLE SOME OF THE TIME", "OTHER")) +
theme_classic() +
# manual theme adjustments
theme(axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.title.y = element_blank(),
legend.position="top",
plot.title = element_text(face = "bold"),
strip.text.x = element_text(size = 9, color = "black", face = "bold"),
plot.caption = element_text(hjust = 0)
)# ckwon822
common_first_names <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/most-common-name/new-top-firstNames.csv")
# editing data
common_first_names <- common_first_names[1:20, ]
common_first_names <- common_first_names %>%
mutate(sex = case_when (name == "Mary" |
name == "Jennifer" |
name == "Patricia" |
name == "Linda" |
name == "Elizabeth" ~ "female",
name != "Mary" |
name != "Jennifer" |
name != "Patricia" |
name != "Linda" |
name != "Elizabeth" ~ "male",),
percentage = round(newPerct2013 * 1000, digits = 1))
# creating visualization
common_first_names %>%
ggplot(aes(y = reorder(name, percentage), x = percentage, fill = sex)) +
geom_histogram(stat = "identity") +
guides(fill = "none") +
annotate("text", x = 9.65, y = 21.7, label = expression(bold("MALE")), cex = 3.85, hjust = 1, vjust = 1, color = "dodgerblue") +
annotate("text", x = 11.5, y = 21.7, label = expression(bold("FEMALE")), cex = 3.85, hjust = 1, vjust = 1, color = "gold1") +
geom_text(aes(label = signif(percentage)), nudge_x = 0.5) +
labs(title = "Most Common First Names",
subtitle = "Per 1,000 Americans as of 2013") +
scale_fill_manual(values = c("male" = "dodgerblue",
"female" = "gold1")) +
theme_classic() +
theme(plot.title.position = "plot",
panel.grid.major.y = element_blank(),
plot.title = element_text(size = 16,
face = "bold"),
plot.subtitle = element_text(size = 11),
axis.ticks.x = element_blank(),
axis.ticks.y = element_blank(),
axis.line.x = element_blank(),
axis.line.y = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_text(color = "black"),
axis.title.x = element_blank(),
axis.title.y = element_blank())Take a Sad Plot & Make It Better
# ckwon
medals <- tibble(
country = c(
rep("USA", 79), rep("CHN", 70), rep("ROC", 53), rep("GBR", 48), rep("JPN", 40)),
medal_type = c(
rep("gold", 25), rep("silver", 31), rep("bronze", 23),
rep("gold", 32), rep("silver", 22), rep("bronze", 16),
rep("gold", 14), rep("silver", 21), rep("bronze", 18),
rep("gold", 15), rep("silver", 18), rep("bronze", 15),
rep("gold", 21), rep("silver", 7), rep("bronze", 12)))
# creating visualization
medal_viz <- medals %>%
mutate(country = factor(country, levels = c("JPN", "GBR","ROC", "CHN", "USA"))) %>%
ggplot(aes(y = country, fill = factor(medal_type, levels = c("bronze", "silver", "gold")))) +
geom_bar() +
annotate("text", x = 4.5, y = 5.05, label = "25", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 30.5, y = 5.05, label = "31", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 61.5, y = 5.05, label = "23", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 86.5, y = 5.05, label = expression(bold("79")), cex = 5, hjust = 1, vjust = 1) +
annotate("text", x = 4.5, y = 4.05, label = "32", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 37.5, y = 4.05, label = "22", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 58.5, y = 4.05, label = "16", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 76.5, y = 4.05, label = expression(bold("70")), cex = 5, hjust = 1, vjust = 1) +
annotate("text", x = 4.5, y = 3.05, label = "14", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 18.5, y = 3.05, label = "21", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 39.5, y = 3.05, label = "18", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 59.5, y = 3.05, label = expression(bold("53")), cex = 5, hjust = 1, vjust = 1) +
annotate("text", x = 4.5, y = 2.05, label = "15", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 19.5, y = 2.05, label = "18", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 38, y = 2.05, label = "15", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 54.5, y = 2.05, label = expression(bold("48")), cex = 5, hjust = 1, vjust = 1) +
annotate("text", x = 4.5, y = 1.05, label = "21", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 23.5, y = 1.05, label = "7", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 32.5, y = 1.05, label = "12", cex = 4, hjust = 1, vjust = 1) +
annotate("text", x = 46.5, y = 1.05, label = expression(bold("40")) , cex = 5, hjust = 1, vjust = 1) +
labs(title = "Medals Won at the Tokyo Olympics (ongoing)",
subtitle = "Distribution of medals won by the top 5 countries (ordered by total)",
fill = "Medal Type") +
scale_fill_manual(values = c("gold" = "gold",
"silver" = "gray75",
"bronze" = "tan3")) +
theme(#legend.title = element_text(face = "bold"),
legend.position = "top") +
guides(fill = guide_legend(title.position = "top")) +
theme_classic() +
theme(plot.title.position = "plot",
panel.grid.major.y = element_blank(),
plot.title = element_text(size = 16,
face = "bold"),
plot.subtitle = element_text(size = 11),
axis.ticks.x = element_blank(),
axis.ticks.y = element_blank(),
axis.line.x = element_blank(),
axis.line.y = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_text(color = "black",
#face = "bold",
size = 11),
axis.title.x = element_blank(),
axis.title.y = element_blank())
medal_viz +
theme(legend.position = c(0.8, 0.25))tidyverseFor each case study (2), during lecture:
For each case study:
With your group, you will:
You’ll need to do something more on the topic beyond what is presented in class.
Examples:
Graded on:
Two possible Paths: